Representing and Accessing Multi-Level Annotations in MMAX2

Author

  • Christoph Müller
Abstract

MMAX2 is a versatile, XML-based annotation tool which has already been used in a variety of annotation projects. It is also the tool of choice in the ongoing project DIANA-Summ, which deals with anaphora resolution and its application to spoken dialog summarization. The project uses the ICSI Meeting Corpus (Janin et al., 2003), a corpus of multi-party dialogs which contains a considerable amount of simultaneous speech. It features a semiautomatically generated segmentation in which the corpus developers tried to track the flow of the dialog by inserting segment starts approximately whenever a person started talking. As a result, the corpus has some interesting structural properties, most notably overlap, that are challenging for an XML-based representation format. The following brief overview of MMAX2 focuses on this aspect, using examples from the ICSI Meeting Corpus.

Similar resources

Multi-level annotation of linguistic data with MMAX2

This paper describes how richly annotated corpora can be created with the annotation tool MMAX2. The description is from the point of view of Computational Linguistics, a discipline where annotated corpora are often used as resources for software development. The paper outlines the important steps in the life cycle of an annotation and details how the tool MMAX2 can be employed in each of them.

EXCOTATE: An Add-on to MMAX2 for Inspection and Exchange of Annotated Data

In this paper, we present an add-on called EXCOTATE for the annotation tool MMAX2. The add-on interacts with annotated data stored in and spread over different MMAX2 projects. The data can be inspected, revised, and analyzed in a tabular format, and will be reintegrated into MMAX2 projects afterwards. It is based on Microsoft Excel with extensive usage of the script language Visual Basic for App...

Representing and Accessing Multilevel Linguistic Annotation using the MEANING Format

We present an XML annotation format (MEANING Annotation Format, MAF) specifically designed to represent and integrate different levels of linguistic annotations and a tool that provides flexible access to them (MEANING Browser). We describe our experience in integrating linguistic annotations coming from different sources, and the solutions we adopted to implement efficient access to corpora an...

Representing Multimodal Linguistics Annotated data

The question of interoperability for linguistically annotated resources involves several aspects. First, it requires a representation framework that makes it possible to compare, and potentially merge, different annotation schemas. In this paper, a general description level representing the multimodal linguistic annotations is proposed. It focuses on time and data content representation: This...

Using Semantic Metadata for Discovery and Integration of Heterogeneous Ecological Data

Effective discovery and integration of ecological data within data management systems requires rich semantic information that can describe and relate the types of information contained within disparate data sets. Within the Semtools project, we have developed approaches for expressing and representing semantic annotations of data sets for supplementing attribute and data-level metadata with ter...

Publication date: 2006